# Doc: KIND complex network scenarios #1337

## Conversation
/assign @BenTheElder
@aojea: GitHub didn't allow me to request PR reviews from the following users: neiljerram, howardjohn, qinqon. Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Just a couple of typos that I spotted.
## Multiple clusters
As we explained before, all KIND clusters are sahring the same docker network, that means that all the cluster nodes have direct connectivity.
typo "sahring"
If we want to spawn multiple cluster and provide Pod to Pod connectivity between different clusters, first we have to configure the cluster networking parameters to avoid address overlapping.
"multiple clusters"
Great!
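For reference, a minimal sketch of the non-overlapping configuration discussed above, assuming two clusters named clusterA and clusterB (the subnet values are illustrative, not prescribed by the doc):

```bash
# Give each cluster disjoint Pod and Service CIDRs so their routes never collide
cat <<EOF | kind create cluster --name clusterA --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  podSubnet: "10.110.0.0/16"
  serviceSubnet: "10.115.0.0/16"
EOF

cat <<EOF | kind create cluster --name clusterB --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  podSubnet: "10.220.0.0/16"
  serviceSubnet: "10.225.0.0/16"
EOF
```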
inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
{{< /codeFromInline >}}
That means that Pods will be able to reach other dockers containers that does not belong to any KIND cluster, however, the docker container will not be able to answer to the Pod IP address until we intall the correspoding routes.
"reach other dockers containers" maybe should change to: "reach other docker containers"
I had the same doubt, but with the amount of "containers", pods, ... that we have in these virtualized environments I think that maybe it is good to be explicit about this.
/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: aojea, tao12345666333. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@aojea Would this be a good time to talk more about your comment at #939 (comment)? I understand that you and @BenTheElder had concerns about my proposal at the time, but I am not sure you are right that my objective can be easily achieved in other existing ways.
/hold @neiljerram my understanding is that you want to automate in KIND:
if that's correct I can document how to do it; it seems easy to automate with a script, or in the same way that the kubeadm friends are doing with kinder https://github.com/kubernetes/kubeadm/tree/master/kinder#usage. IMHO that seems a very intrusive and specific change to target multi-homed node environments, which are not very common in cloud environments ... bear in mind that the main goal of KIND is testing Kubernetes.
Thanks @aojea for your interest in this. At the time of that PR, I was modelling a multi-homed infrastructure, with two independent planes of connectivity between any two nodes. Obviously the idea is that if one of the connectivity planes fails in some way, we still have connectivity between all the nodes over the other plane. My reason for thinking that this needs integration in KIND is as follows.
WDYT? Am I still missing other possible approaches here? |
/hold cancel
yeah, I totally understand your point from the network engineering perspective, but that setup needs a routing protocol to work and do the failover. I know that Calico and kube-router give that possibility, allowing you to peer with the leaf switches, but as I've said before this is a very specific scenario for bare-metal environments, where you don't have an IaaS handling the infrastructure. For "cloud-native" environments, the IaaS + cloud-controller-manager and Kubernetes + controller loops handle the "resilience" of the environment, i.e. the VMs only need one interface because the network is "virtual" and the IaaS handles it; for the Kubernetes workloads, the controller loops handle the pods and services, restart containers that fail, replace containers, kill containers that don't respond to your user-defined health check, and don't advertise them to clients until they are ready to serve. Basically everything is cattle ... or should be 😉 Specifically for KIND, the network is a Linux bridge, everything is software and on the same host; if one interface or bridge fails we'll have a bigger problem 😄
/retest
@aojea Many thanks. So if I understand correctly, I think your position can be summarised as:
Is that right?
my point is that I don't see the need to implement it in KIND because you can do it just after the cluster creation. This is an example with bash; using python, go, ... you can easily build much more complex topologies and parametrize them:

```bash
LOOPBACK_PREFIX="1.1.1."
MY_BRIDGE="my_net2"
MY_ROUTE=10.0.0.0/24
MY_GW=172.16.17.1
# Create 2nd network
docker network create ${MY_BRIDGE}
# Create kubernetes cluster
kind create cluster
# Configure nodes to use the second network
i=0
for n in $(kind get nodes); do
  i=$((i+1))
  # Connect the node to the second network
  docker network connect ${MY_BRIDGE} ${n}
  # Configure a unique loopback address per node
  docker exec ${n} ip addr add ${LOOPBACK_PREFIX}${i}/32 dev lo
  # Add static routes
  docker exec ${n} ip route add ${MY_ROUTE} via ${MY_GW}
done
```
It's not just that: KIND gates Kubernetes and is used as CI in a large number of Kubernetes ecosystem projects. I'm afraid that the risk of introducing this change could affect the stability of the project, and hence all those CIs. You can't imagine the amount of hours that @BenTheElder mainly, @amwat, I, and others have spent debugging flakiness and optimizing KIND.
New changes are detected. LGTM label has been removed.
I described in my previous comment why this is not good enough: I need the Kubernetes control plane connections to be using loopback addresses, and IIRC those are set up during cluster creation. Do you think I've got something wrong there?
ok, now I got it, sorry for the confusion but I wasn't understanding your point ... It can be done after the cluster setup; it is a bit tricky though. When creating the cluster, add the loopback IP address you are going to use for the control-plane to the certificate SANs (the apiserver binds to "all interfaces" by default). After the cluster has been created, modify the kube-apiserver `advertise-address`, then change the `node-ip` flag in all the kubelets, and restart them.
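A minimal sketch of that first step, assuming a hypothetical loopback address 1.1.1.1 for the control-plane; the SAN is injected through kind's `kubeadmConfigPatches` (the config apiVersion may differ across kind releases):

```bash
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
  apiVersion: kubeadm.k8s.io/v1beta2
  kind: ClusterConfiguration
  apiServer:
    certSANs:
    - "1.1.1.1"  # hypothetical loopback address used for the control-plane
EOF
```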
---
# Using KIND to emulate complex network scenarios [Linux Only]

KIND runs Kubernetes cluster in Docker, and leverages Docker networking for all the network features: portmapping, IPv6, containers connectivity, ...
...connectivity, etc.
portmapping -> port mapping
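As an illustration of how those Docker-backed features surface to the user, a sketch of a cluster config exercising port mapping and IPv6 (the port values are assumptions, not from the doc):

```bash
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  ipFamily: ipv6            # rides on Docker's IPv6 support
nodes:
- role: control-plane
  extraPortMappings:        # implemented with Docker port publishing
  - containerPort: 30950
    hostPort: 8080
EOF
```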
valid_lft forever preferred_lft forever
{{< /codeFromInline >}}
Docker also creates iptables NAT rules on the docker host that masquerade the traffic from the containers connected to docker0 bridge to connect to the outside world.
the docker host -> the Docker host
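Those NAT rules can be inspected on the host; a quick sketch (the exact rule text varies with the Docker version and network setup):

```bash
sudo iptables -t nat -S POSTROUTING | grep MASQUERADE
# typically prints something like:
# -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
```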
## Multiple clusters
As we explained before, all KIND clusters are sharing the same docker network, that means that all the cluster nodes have direct connectivity.
docker network -> Docker network
{{< /codeFromInline >}}
Then we just need to install the routes obtained from clusterA in each node of clusterB and viceversa:
viceversa -> vice versa
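A sketch of that step, assuming clusters named clusterA and clusterB: generate the route commands from clusterA (with the jsonpath query the doc itself uses) and execute them inside every clusterB node, then repeat in the other direction:

```bash
routes=$(kubectl --context kind-clusterA get nodes -o=jsonpath='{range .items[*]}{"ip route add "}{.spec.podCIDR}{" via "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}')
for n in $(kind get nodes --name clusterB); do
  echo "${routes}" | while read -r r; do
    # install each route inside the node's network namespace
    [ -n "${r}" ] && docker exec "${n}" ${r}
  done
done
```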
### Example: Multiple network interfaces and Multi-Home Nodes
There can be scenarios that require multiple interfaces in the KIND nodes to test multi-homing, VLANS, CNI plugins, ...
...CNI plugins, etc.
inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
{{< /codeFromInline >}}
That means that Pods will be able to reach other Docker containers that does not belong to any KIND cluster, however, the Docker container will not be able to answer to the Pod IP address until we install the correspoding routes.
correspoding -> corresponding
- --advertise-address=172.17.0.4
```

and then change in all the nodes the kubelet `node-ip` flag:
and then change the `node-ip` flag for the kubelets on all the nodes:
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=/run/containerd/containerd.sock --fail-swap-on=false --node-ip=172.17.0.4"
```
and restart them `systemctl restart kubelet` to use the new config
Finally restart the kubelets to use the new configuration with `systemctl restart kubelet`.
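Put together, a sketch of those post-creation steps, assuming the default node names and a hypothetical new address 172.17.0.4 (a real setup would use a different `--node-ip` per node):

```bash
# kube-apiserver runs as a static pod; editing its manifest makes the kubelet restart it
docker exec kind-control-plane sed -i \
  's/--advertise-address=.*/--advertise-address=172.17.0.4/' \
  /etc/kubernetes/manifests/kube-apiserver.yaml

# append --node-ip inside the quoted KUBELET_KUBEADM_ARGS value, then restart each kubelet
for n in $(kind get nodes); do
  docker exec "${n}" sed -i 's/"$/ --node-ip=172.17.0.4"/' /var/lib/kubelet/kubeadm-flags.env
  docker exec "${n}" systemctl restart kubelet
done
```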
Important to note here is that calling `kubeadm init / join` again on the node will override `/var/lib/kubelet/kubeadm-flags.env`. An alternative is to use `/etc/default/kubelet`: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/kubelet-integration/#the-kubelet-drop-in-file-for-systemd
let me add it as a note, due to the ephemeral nature of the nodes I don't expect people to issue those commands ... but 🤷♂️
It's important to note that calling `kubeadm init / join` again on the node will override `/var/lib/kubelet/kubeadm-flags.env`. An [alternative is to use /etc/default/kubelet](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/kubelet-integration/#the-kubelet-drop-in-file-for-systemd).S
trailing S after the .
/retest
inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
{{< /codeFromInline >}}
That means that Pods will be able to reach other Docker containers that does not belong to any KIND cluster, however, the Docker container will not be able to answer to the Pod IP address until we install the corresponding routes.
Since you are referring to multiple containers, use `do` instead of `does`.
sorry for the immense delay. I'd been hoping to get #148 done faster. I'd still like to hold off detailing networking internals until after I'm done taking a swing at changing them :D
/hold
ip route add 10.110.2.0/24 via 172.17.0.2

$ kubectl --context kind-clusterB get nodes -o=jsonpath='{range .items[*]}{"ip route add "}{.spec.podCIDR}{" via "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
ip route add 10.120.0.0/24 via 172.17.0.7
Is this supposed to be 220? Also why are there three results here when each cluster has two nodes?
heh, good catch on both things; it is 220, and the config should have 3 nodes.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
There were several PRs and demand to implement this in KIND.
However, I think that KIND can serve better as a building block for complex scenarios that can be easily scripted, avoiding adding complexity to the project.